On a generalization of margin-based discriminative training to robust speech recognition
Abstract
Margin-based learning for automatic speech recognition (ASR) has recently been studied intensively. We believe that by securing a margin between the decision boundaries and the training samples, a correct decision can still be made when the mismatch between testing and training samples falls well within the tolerance region specified by the margin. This property should be effective for robust ASR, where the testing conditions differ from those in training. In this paper, we report experimental results with soft margin estimation (SME) on the Aurora2 task and show that SME is very effective under clean training, with relative word error reductions of more than 50% in the clean, 20 dB, and 15 dB testing conditions, and that it still gives a slight improvement over conventional multi-condition training approaches. This demonstrates that the margin in SME can equip recognizers with good generalization under adverse conditions.
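The "relative word error reduction" quoted in the abstract is conventionally computed against the baseline word error rate (WER). The snippet below is a minimal sketch of that arithmetic; the function name `relative_wer_reduction` and the example numbers are assumptions made for illustration and are not taken from the paper.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the reduction in WER as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WER values (in percent), chosen only to illustrate the formula;
# they are NOT results reported in the paper.
reduction = relative_wer_reduction(baseline_wer=4.0, new_wer=1.0)
print(f"{reduction:.0%}")  # prints "75%"
```

Under this definition, a reduction of more than 50% means the new system makes fewer than half as many word errors as the baseline in the same test condition.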
Similar resources
Large-Margin Gaussian Mixture Modeling for Automatic Speech Recognition
Discriminative training for acoustic models has been widely studied to improve the performance of automatic speech recognition systems. To enhance the generalization ability of discriminatively trained models, a large-margin training framework has recently been proposed. This work investigates large-margin training in detail, integrates the training with more flexible classifier structures such...
Large Margin Training of Acoustic Models for Speech Recognition
Fei Sha. Advisor: Prof. Lawrence K. Saul. Automatic speech recognition (ASR) depends critically on building acoustic models for linguistic units. These acoustic models usually take the form of continuous-density hidden Markov models (CD-HMMs), whose parameters are obtained by maximum likelihood estimation. Recently, however, there ha...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have proven effective in speech recognition systems for both feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
A log-linear discriminative modeling framework for speech recognition
Conventional speech recognition systems are based on Gaussian hidden Markov models (HMMs). Discriminative techniques such as log-linear modeling have been investigated in speech recognition only recently. This thesis establishes a log-linear modeling framework in the context of discriminative training criteria, with examples from continuous speech recognition, part-of-speech tagging, and handwr...
Structured Support Vector Machines for Speech Recognition
Discriminative training criteria and discriminative models are two effective improvements for HMM-based speech recognition. This thesis proposed a structured support vector machine (SSVM) framework suitable for medium to large vocabulary continuous speech recognition. An important aspect of structured SVMs is the form of features. Several previously proposed features in the field are summarized in ...
Publication date: 2008